feat(backend): Lifecycle Manager + Session Manager lane by harshitsinghbhandari · Pull Request #2 · aoagents/ReverbCode

harshitsinghbhandari · 2026-05-26T14:30:55Z

Summary

This branch is the complete Lifecycle Manager (LCM) + Session Manager (SM) lane of the backend rewrite — the deterministic core that supervises agent sessions and keeps one true status per session. It was built contract-first on a single integration branch; the sub-PRs below merged into this branch, and it now lands on main as one cohesive unit.

Full documentation lives in docs/: start with docs/architecture.md, and see docs/status.md for what's done vs. what's left.

What it delivers

A per-session OBSERVE → DECIDE → ACT loop where OBSERVE lives outside the LCM (observers call in) and the LCM is a synchronous reducer:

domain/ — the canonical state model (CanonicalSessionLifecycle, persisted) and DeriveLegacyStatus, the sole producer of the display status (derived on read, never stored).
domain/decide/ — the pure, total, deterministic decision core (liveness reconciliation, the PR ladder, the anti-flap detecting quarantine), table-tested to 100%.
ports/ — the stabilized boundary: inbound LifecycleManager/SessionManager, outbound LifecycleStore/Notifier/AgentMessenger/Runtime/Agent/Workspace, and the SCMFacts/RuntimeFacts/ActivitySignal DTOs.
lifecycle/ — the LCM: the Apply* pipeline (load → decide → diff → persist), per-session serialization, the liveness-vs-activity composition rules, and the reaction table + escalation engine (TickEscalations).
session/ — the SM: spawn / kill / restore / cleanup / list, with eager rollback and the worktree-remove safety.

Folded-in PRs (merged into this branch)

contracts + CI workflows (feat(backend): Lifecycle Manager + Session Manager lane #2 base)
pure DECIDE core — feat(decide): pure DECIDE core + exhaustive truth-table tests #4
LCM pipeline (split A) — feat(lifecycle): LCM Apply* pipeline (split A) #5
LCM reactions + escalation (split B) — feat(lifecycle): ACT layer — reaction table + escalation engine (split B) #6
Session Manager — feat(session): Session Manager — spawn/kill/list/get/send/restore/cleanup #7
docs

Key invariants

Persist canonical state; derive the display status on read.
One authority for death — only the decide pipeline (via detecting) writes inferred terminal states; the SM's explicit-kill path goes through OnKillRequested.
Failed probe ≠ dead; PR facts dominate the soft session states once a PR exists; worktree-remove safety on teardown.

Status

Implemented and tested entirely behind in-memory fakes — gofmt / go build / go vet / go test -race green across domain, decide, lifecycle, and session. The SM tests drive the real LCM for spawn/kill round-trips.

Not in this lane (integration phase, tracked in #3): swapping fakes for real adapters (persistence/CDC, SCM poller, runtime/agent/workspace plugins, notifier, API), plus three carried-forward items — the react() out-of-lock dispatch ordering, the unused ExpectedRevision CAS for multi-writer/CDC, and a real implementation of the store's new Seed/Get. See docs/status.md for the full hand-off.

Test plan

cd backend && gofmt -l . && go build ./... && go vet ./... && go test -race ./... all green
Review the cumulative diff vs. main as one lane (sub-PRs were reviewed against the integration branch)

🤖 Generated with Claude Code

Contract-first boundary for the Lifecycle Manager + Session Manager lane. Pure shapes only — types, interfaces, and the display-status derivation — so adil (SCM poller), Tom (persistence), and aditi (API) can review and build against a stable boundary before any behaviour lands. domain/ - CanonicalSessionLifecycle: the only persisted state (session/pr/runtime sub-states), with Activity + Detecting sub-states added as decider inputs that must survive between observations. - DeriveLegacyStatus: the sole producer of the derived display status (never persisted), with 11 table tests. ports/ - inbound: LifecycleManager (Apply* pipeline, per-session serialised) and SessionManager. - outbound: LifecycleStore (Tom), Notifier, AgentMessenger, and the Runtime/Agent/Workspace plugin ports (co-owned with the agents lane). - facts: SCMFacts / RuntimeFacts / ActivitySignal DTOs. decide/ pure-core signatures + I/O types; bodies stubbed for the next PR. Folds in four design-review fixes (documented in-code, pending team confirm): 1. Activity + Detecting persisted so the pure decider has memory across calls. 2. Per-session serialisation documented; LifecyclePatch.ExpectedVersion offers optimistic-locking as an alternative. 3. LifecyclePatch is a sparse pointer-field merge-patch (+ ClearDetecting). 4. SCMFacts gains Fetched (failed fetch != "PR closed") and per-comment IsBot (bot vs human route to different reactions). go build / go vet / go test all pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

This PR introduces the initial contract-first Go packages for the backend “Lifecycle Manager (LCM) + Session Manager (SM)” lane, defining the canonical lifecycle domain model, inbound/outbound ports, and a single pure status-derivation function (with tests) to stabilize boundaries for parallel implementation work.

Changes:

Added domain canonical lifecycle/state enums, Session read-model, and DeriveLegacyStatus (plus table-style tests).
Added ports package with inbound interfaces (LCM/SM), outbound interfaces (store/notifier/runtime/agent/workspace), and fact DTOs (SCM/runtime/activity/spawn/kill).
Added domain/decide package with pure-core I/O shapes and stubbed function bodies for a follow-up PR.

Reviewed changes

Copilot reviewed 8 out of 8 changed files in this pull request and generated 8 comments.

Show a summary per file

File	Description
backend/internal/ports/outbound.go	Outbound ports for persistence, notifications, and runtime/agent/workspace plugins
backend/internal/ports/inbound.go	Inbound contracts for LifecycleManager and SessionManager
backend/internal/ports/facts.go	DTOs (“facts”) crossing the LCM/SM boundary (SCM, runtime, activity, spawn/kill)
backend/internal/domain/status.go	Derived display status enum + `DeriveLegacyStatus`
backend/internal/domain/status_test.go	Tests for `DeriveLegacyStatus`
backend/internal/domain/session.go	API read-model `Session` definition
backend/internal/domain/lifecycle.go	Canonical persisted lifecycle model + enums/substates
backend/internal/domain/decide/decide.go	Decide-core input/output shapes and tuning constants (stubbed bodies)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Review feedback on PR #2: - Add CanonicalSessionLifecycle.Revision (monotonic write counter) distinct from the schema Version; LifecyclePatch.ExpectedVersion -> ExpectedRevision now compares it, so optimistic locking actually works. - LifecycleStore.List returns []domain.SessionRecord (persistence shape, no derived status); add SessionRecord and make Session embed it. Keeps the Session Manager the single producer of the derived display status. - SpawnOutcome.RuntimeHandle is now the structured ports.RuntimeHandle, not a string, so Destroy/SendMessage get the handle without ad-hoc encoding. - Agent.IsProcessRunning -> ProbeProcess returning ProcessProbe (liveness), not domain.ActivityState; the name no longer implies a boolean. - Document LifecyclePatch Detecting vs ClearDetecting precedence (clear wins). - Correct the DeriveLegacyStatus doc: hard session states outrank PR facts; "PR dominates" applies only to the soft idle/working states. Implementation was already correct (matches canonical AO); only the comment overstated it. - Replace personal-name attributions in package/interface comments with role-based terms (SCM poller / persistence adapter / API layer). go build / go vet / go test all pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- go.yml: gofmt check, build, vet, and race-enabled tests for backend/, triggered on backend changes and pushes to main. - gitleaks.yml: secret scanning on PRs and main using gitleaks-action v1 (license-free; v2 requires GITLEAKS_LICENSE for org repos). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…tests Replace the stubbed deciders in domain/decide with real, total, side-effect-free implementations: - ResolveProbeDecision: kill-intent short-circuits to terminal; failed probes and probe disagreement route to detecting; only runtime-dead + process-dead + no-recent-activity concludes killed. - ResolveOpenPRDecision: the PR pipeline ladder (ci_failing > changes_requested > approved+mergeable > approved > review_pending > idle-beyond > open). - ResolveTerminalPRStateDecision: merged -> idle/merged_waiting_decision, closed -> idle. - CreateDetectingDecision: anti-flap quarantine — unchanged-evidence counter with StartedAt preserved across the episode so the duration cap is a real wall-clock safety net; escalates to stuck at 3 ticks or 5m. - HashEvidence: strips timestamps/epochs and collapses whitespace before hashing so restamped-but-unchanged signals compare equal. Table tests cover every branch (100% statement coverage), including a consistency check that the open-PR ladder's display Status matches DeriveLegacyStatus over the canonical state it emits. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ency test Address PR review (aa-1): - ResolveOpenPRDecision now keys MERGEABLE on Mergeable alone (checked before Approved). Mergeability is the authoritative merge gate, so a PR on a no-required-review repo no longer falls through to PR_OPEN. The approved+mergeable and approved-only cases are unchanged. - Broaden the derive-consistency test to cover the probe and terminal deciders too, not just the open-PR ladder. - Document the HashEvidence epoch-stripping regex's breadth. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ting helper Address PR review (aa-1): - Move the decider input/output type definitions (LifecycleDecision, ProbeInput, ProcessLiveness, OpenPRInput, DetectingInput) out of the execution file into a dedicated types.go, leaving decide.go for behaviour. - Expand the detecting() helper doc to spell out that it adapts a probe verdict into the shared CreateDetectingDecision anti-flap path so each probe branch doesn't re-implement the quarantine counter. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Clarify the timestamp-stripping block flagged in review: spell out what each of the three regexes matches (full ISO/RFC3339 datetime, bare time-of-day, bare unix epoch) and why order matters. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

… contract Address Copilot review: - ResolveOpenPRDecision now sets a stable, timestamp-free Evidence summary "<condition> #<num> <url>" for every ladder outcome, consuming the previously-unused OpenPRInput.Number/URL identity inputs and making PR decisions traceable in logs. Covered by TestResolveOpenPRDecisionEvidence. - Document LifecycleDecision's zero-value contract: an empty PRState/PRReason means "this decider does not address PR — leave unchanged", not PRNone. The LCM must map empty PR fields to a nil LifecyclePatch.PR; writing PRNone on a probe tick would clobber a live PR. (Pointers were considered but the empty sentinel is distinguishable from every valid state and consistent with the codebase's value-enum style; LifecyclePatch already owns nil-means-unchanged.) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat(decide): pure DECIDE core + exhaustive truth-table tests

Implements ports.LifecycleManager as a synchronous observe->decide->persist reducer. Every entrypoint runs the shared pipeline under a per-session lock: load canonical -> run the matching pure decider -> diff into a sparse merge-patch -> persist. Never polls, never writes the display status. - ApplyRuntimeObservation -> probe decider; always writes the runtime axis. - ApplySCMObservation -> open-PR / terminal-PR deciders (failed fetch is a no-op: failed probe != "no PR"). Open PRs write only the PR axis. - ApplyActivitySignal -> updates the activity axis + maps onto the session axis; only valid-confidence signals are authoritative. - OnSpawnCompleted -> runtime alive + handles to metadata; session stays not_started (display: spawning). - OnKillRequested -> SM's explicit terminal-write authority. - TickEscalations -> no-op stub (reaction/escalation engine is split B). Composition rule (#1): liveness owns the runtime + death axis; activity owns the working/idle/waiting axis. A healthy probe verdict writes the session axis only to recover a liveness-owned state, so it never clobbers an activity-owned needs_input/blocked. Activity is the mirror: it stays off the death axis. Detecting clear (#3): a non-detecting probe verdict clears stale detecting memory so the next probe reads no phantom Prior. Built/tested against in-memory fakes (LifecycleStore with full merge-patch + ExpectedRevision, recording Notifier/AgentMessenger). Per-session serialisation verified under -race. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

- shouldWriteSessionRuntime: never resurrect a terminal session; an observation may refresh the runtime axis but must touch neither the session axis nor the detecting memory (gated in ApplyRuntimeObservation). - OnSpawnCompleted: error on an unseeded session instead of fabricating a partial record (SM must seed first — a missing seed is a contract violation). - OnKillRequested: no-op on an unknown/already-gone session (benign race) instead of fabricating a terminal record. - keyedMutex: reference-count entries and evict on last release so the lock map stays bounded in a long-running daemon. - runtimeSubstateFromFacts: map RuntimeProbeIndeterminate to RuntimeUnknown with a neutral reason, distinct from the probe_error of a failed probe. Adds tests for terminal non-resurrection, unseeded spawn-completed error, and unknown-session kill no-op. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Address Harshit's PR #5 review (approve w/ design confirms + polish): - #1 (design decision): a valid activity signal is proof of life, so it now resolves a detecting session — writes the activity-mapped session state and clears the quarantine memory. Scoped to detecting only; a liveness-escalated stuck stays the probe pipeline's to resolve. Terminal still never reopens. - #2: document why a merged/closed PR parks the session axis even over an activity-owned needs_input/blocked (a merge is a milestone), unlike the open-PR path that defers to activity. - #3: map plain idle activity to a neutral session reason instead of the misleading research_complete (kept for ready, which implies completion). - #6: cover all three kill kinds (manual/cleanup/error), the open-PR review branches (changes_requested/mergeable/review_pending), and the neutral idle reason. Coverage 86.5% -> 88.6%. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat(lifecycle): LCM Apply* pipeline (split A)

…t B) Add the ACT half of the LCM: map persisted status transitions to reactions (send-to-agent / notify / auto-merge) and drive escalation. - reactions.go: the §4.2 default reaction table, reactionEventFor (mirrors DeriveLegacyStatus for the ACT layer), in-memory per-(session,reaction) escalation trackers, the react() dispatch chokepoint, and a real TickEscalations for duration-based escalations the synchronous LCM can't wake itself for. auto-merge action exists but is off by default; bugbot-comments/merge-conflicts are configured but dormant (no decide-core producer yet). - manager.go: mutate now returns a transition; each Apply* path fires the mapped reaction after persist via the single synchronous react() seam. OnKillRequested intentionally does not react (explicit kill != inferred event). Split-A load->decide->diff->persist behavior is unchanged. ci-failed budget is persistent across fail->pending->fail oscillation; non-persistent trackers reset when the status leaves the triggering state. Escalation silences further auto-dispatch until the condition clears. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

review) Address review finding #1: the persistent ci-failed tracker leaked and could stale-silence a future regression. It was only cleared when leaving the ci-failed reaction AND incidentOver held at that moment — so a recovery to another open-PR state (ci-failed -> approved -> merged) never cleared it. - react() now clears ALL of a session's trackers when the state REACHED is incident-over (PR resolved / session terminal) OR a genuine recovery (approved/mergeable, which the open-PR ladder guarantees means CI is no longer failing). Keyed on the state reached, not the one left, since the recovery transition is typically review_pending->approved (empty beforeKey). - Persistent ci-failed still survives the ambiguous review_pending limbo, so fail->pending->fail keeps one shared budget (§4.2). - Document the out-of-lock react() dispatch caveat for the daemon integration step (review #2) and the intentionally-skipped agent-stuck 10m threshold. Tests: re-arm after a genuine recovery (regression re-nudges, not silenced); all session trackers cleared once the incident is over. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…escalation (Copilot review) - OnKillRequested now clears the session's escalation trackers after a successful kill, so a later duration-based TickEscalations can't emit reaction.escalated for a dead session (dispatch is still skipped). - sendToAgent rolls back the attempt (and firstAttemptAt when it set it) on a messenger.Send error, so undelivered messages don't march a reaction toward escalation — honoring "send failures retry next tick" (§4.3). - Duration escalation now uses an inclusive boundary (>=) in both shouldEscalate and TickEscalations, so a 30m reaction escalates at exactly 30m instead of waiting for the next tick. Tests: kill clears trackers + no post-kill escalation; repeated failed delivery never escalates; duration escalation fires at exactly escalateAfter. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat(lifecycle): ACT layer — reaction table + escalation engine (split B)

…store/cleanup) Implements ports.SessionManager against fakes for the outbound ports. The SM is the explicit-mutation half of the lane: it drives Runtime/Agent/Workspace, seeds the initial lifecycle, and routes outcomes to the LCM (OnSpawnCompleted / OnKillRequested). It never derives observed state and is the single producer of the derived display status (attached on read, never persisted). - Spawn: Workspace.Create -> Runtime.Create (AO_* identity env) -> Seed -> OnSpawnCompleted, with eager rollback of completed steps on failure. - Kill: OnKillRequested first -> Runtime.Destroy -> Workspace.Destroy, honoring the worktree-remove safety (refusal surfaced, never forced). - List/Get: derive status via DeriveLegacyStatus. Send: via AgentMessenger. Restore: re-seed (reopen) + relaunch via GetRestoreCommand. Cleanup: reclaim terminal sessions, skip worktrees holding uncommitted work. Store-contract additions (co-owned with Tom's persistence layer, flagged for review): LifecycleStore.Seed (explicit create-with-identity; OnSpawnCompleted requires a seeded record) and LifecycleStore.Get (single record-with-identity read; Load is lifecycle-only). Lifecycle test fake updated to satisfy both. Tests route through the real LCM Manager (wrapped to record call order). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ntime on post-create failure (PR #7 review) - Restore now fails early with a clear error if MetaAgentSessionID is missing, rather than emitting an ambiguous "resume nothing" launch command (no stored prompt means a fresh-launch fallback isn't possible). - On a post-runtime-create failure (reopen patch or OnSpawnCompleted), best-effort destroy the newly created runtime (never the workspace, which holds prior work) so we don't strand a live process while parking the session terminal. - Added a test for Restore with a missing agent session id: errors early, touches no workspace/runtime, leaves the session terminal. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…e rollback (review follow-ups) Adds an injectable OnSpawnCompleted failure to the recording LCM and two tests: - Spawn: when OnSpawnCompleted fails, the seeded record is parked terminal/errored (via OnKillRequested(KillError)) and runtime+workspace are torn down. - Restore: when OnSpawnCompleted fails post-create, the new runtime is destroyed while the workspace is preserved. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

feat(session): Session Manager — spawn/kill/list/get/send/restore/cleanup

Add docs/ for newcomers: an index, an architecture deep-dive (the OBSERVE→DECIDE→ACT loop, the canonical state model, the package layout, every component, and the load-bearing invariants), and a status/roadmap (what's done PR-by-PR, what's left, the integration to-dos + carried-forward items, the open cross-lane contract questions, and where to plug in). Link them from the README. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Copilot

Pull request overview

Copilot reviewed 25 out of 25 changed files in this pull request and generated 4 comments.

Address PR #2 Copilot review comments on the merged LCM+SM lane: - session: validate runtime handle + workspace path before Kill/Cleanup teardown; refuse (ErrIncompleteTeardownMetadata) or skip rather than hand empty args to a real adapter's Destroy (unsafe delete). - session: reject Restore unless the session is terminal (ErrNotRestorable) so a live session can't spawn a duplicate runtime/workspace. - ports: document SpawnConfig.OpenTerminal as reserved/not yet honored. - lifecycle: remove the unread reactionConfig.auto field; note approved-and-green is notify-only (human decides to merge). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

fix(session): harden teardown/restore safety + drop dead reaction flag

harshitsinghbhandari · 2026-05-27T08:56:38Z

All 4 review comments addressed in #8 (f03c7c8), now merged into this branch — resolving the threads. Verified on the current tip (CI build-test + scan green; local gofmt/build/vet/test -race green):

Kill/Cleanup teardown — ErrIncompleteTeardownMetadata: required handles (workspace path, runtime id/name) are validated before recording intent or calling any adapter Destroy, so an empty-arg (unsafe) delete can't happen on a partial/orphaned record.
Restore — ErrNotRestorable: Restore now rejects a non-terminal session, preventing a duplicate runtime/workspace for a live id.
SpawnConfig.OpenTerminal — documented as reserved / not yet honored by Spawn (planned terminal-lane feature), so callers aren't misled.
reactionConfig.auto — removed the dead/derivable flag (the only auto:false entry is a notify, which is the human-decision path; auto-merge stays off).

…viewer to worker harness Per PR #197 review feedback: - Reviewer agent posts its review to the PR itself, so remove the ports.PRReviewPoster port, the GitHub review poster, the submit HTTP route + DTO, and the service Submit method (#1, #4, #7). - Trigger spawns the reviewer agent over the worker's worktree with its own review prompt, mirroring the session launch flow (resolve agent by harness -> argv -> runtime.Create) (#8, #9). - Default reviewer harness reuses the worker's harness when supported, falling back to claude-code; reviewer config stays independent of the worker override (#5, #6). - Drop the `ao review` CLI for this PR's scope (#2, #3). Regenerated OpenAPI + TS types. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(review): configurable AO code review backend (V1) Add per-project configurable code review of a worker's PR. A reviewer agent runs one-shot over the worker's own worktree and posts its result to the PR; the worker picks the feedback up through the existing SCM observer review-nudge path. - domain: ProjectConfig.reviewers (+ default reviewer harness), Review / ReviewRun types and verdict/status vocab. - storage: review + review_run tables (0011), sqlc queries, store methods. - service/review: rewrite the in-memory stub as a persisted ReviewService (Trigger/Submit/List) with a reviewer Runner over agent resolver + runtime; ports.PRReviewPoster implemented on the GitHub adapter. - http: session-scoped routes POST /sessions/{id}/reviews/trigger, POST .../submit, GET .../reviews; regenerated OpenAPI + TS types. - cli: ao review trigger|submit|list. - frontend: adapt ReviewDashboard to the per-worker reviews API. Closes #192 Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(review): address review — drop submit/poster/CLI, default reviewer to worker harness Per PR #197 review feedback: - Reviewer agent posts its review to the PR itself, so remove the ports.PRReviewPoster port, the GitHub review poster, the submit HTTP route + DTO, and the service Submit method (#1, #4, #7). - Trigger spawns the reviewer agent over the worker's worktree with its own review prompt, mirroring the session launch flow (resolve agent by harness -> argv -> runtime.Create) (#8, #9). - Default reviewer harness reuses the worker's harness when supported, falling back to claude-code; reviewer config stays independent of the worker override (#5, #6). - Drop the `ao review` CLI for this PR's scope (#2, #3). Regenerated OpenAPI + TS types. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * feat(review): restore ao review submit (records verdict+body in AO) Per maintainer request, bring back `ao review submit`. AO records the reviewer's verdict and body on the review_run and marks the pass complete; it does not post to GitHub — the reviewer agent posts its review to the PR itself. - storage: add review_run.body (0011), persist via Insert/UpdateReviewRunResult. - service: restore Submit (no SCM poster) storing verdict + body. - http: restore POST /sessions/{id}/reviews/submit + SubmitReviewInput. - cli: ao review submit [worker] --verdict --body (worker from arg/--session/$AO_REVIEW_WORKER). - runner: reviewer prompt instructs posting to GitHub and recording via ao review submit. Regenerated OpenAPI + TS types. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(review): move reviewer runner to its own package; sharpen prompt Per PR #197 review: - Move the concrete reviewer runner out of the service layer into a new internal/review_runner package (package reviewrunner), beside other orchestration packages like session_manager. The service keeps only the Runner interface + RunSpec it depends on; the agent-resolver + runtime launch flow lives in review_runner. - Sharpen the reviewer prompt: tell the agent to diff against the PR base, focus on high-confidence findings, post via `gh pr review`, and record the result with `ao review submit`; review-only (no commits/edits). - Add unit tests for the runner. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(review): simplify review_run schema; provider-agnostic reviewer prompt Per PR #197 review: - review_run: status default 'running' (drop 'pending'), drop CHECK constraints on status/verdict, drop the updated_at column and the session/iteration index. Propagated through queries, domain, store, service, and tests. - Reviewer prompt no longer hardcodes GitHub/gh commands — it instructs the agent to use whatever review tooling the provider offers, keeping the flow extensible across SCM providers. Regenerated sqlc + OpenAPI/TS. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(review): launch reviewer before persisting the run Trigger now spawns the reviewer agent first and then writes the review_run with a status derived from the launch outcome (running on success, failed if it never started), instead of inserting a running row and correcting it to failed afterwards. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(review): pluggable reviewer registry distinct from worker harnesses Reviewers are now their own pluggable adapter set, separate from the worker agent registry — adding a reviewer (claude-code today, greptile tomorrow) is a one-line registration that does not widen the worker harness vocabulary, and a worker harness does not automatically become a valid reviewer. - domain.ReviewerHarness: a distinct vocabulary (AllReviewerHarnesses) with its own IsKnown; ReviewerConfig/Review/ReviewRun use it. ResolveReviewerHarness reuses the worker harness only when it is itself a supported reviewer, else falls back to claude-code. - ports.Reviewer: a reviewer-specific contract (ReviewCommand → argv + env) that models one-shot / non-prompt CLIs natively instead of forcing every reviewer through the worker's interactive GetLaunchCommand(Prompt:...). - internal/adapters/reviewer: a separate registry + resolver (mirrors the worker agent registry) with the claude-code reviewer adapter, which owns the review prompt and reuses the worker claude-code launch construction. - review_runner resolves via the reviewer registry (not the worker AgentResolver) and merges AO_REVIEW_WORKER into the adapter's env. - daemon wires the reviewer resolver. Registry/domain parity is test-enforced. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * test(review): cover run-scoped reviewer submit * fix(api): update generated review submit schema * refactor(review): split core engine (internal/review) from API service Move the review orchestration (Trigger/Submit/List, run-id generation, deps, RunSpec/Runner, sentinels) into a transport-independent core package internal/review (Engine). internal/service/review is now a thin API-flow boundary: the controller-facing Manager interface + a Service that delegates to the engine + error re-exports. This keeps the service layer to API concerns and lets the same engine back a future in-process CLI trigger without going through HTTP. review_runner now depends on the core package; daemon builds the engine and wraps it in the service. No API/schema changes. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * feat(review): commit-aware trigger, reviewer handle for UI, no env vars Reworks the review trigger lifecycle and drops env-based coupling: - review_run gains target_sha (the reviewed commit) and drops iteration. A repeat trigger for the same PR head short-circuits to the existing run. - review gains reviewer_handle_id: the live reviewer pane's runtime handle, reused across passes and exposed in the reviews API so the UI can attach its terminal over /mux. - Trigger flow: if a live reviewer pane exists and a new commit arrived, message it to re-review; otherwise spawn a fresh reviewer. The run is recorded only after the reviewer is launched. - No environment variables: the reviewer adapter embeds the explicit `ao review submit --session <w> --run <id>` command in the spawn prompt and the re-review message. CLI submit requires --run/--session (no env fallbacks). - Merge review_runner into internal/review as a Launcher (spawn/notify/alive). - Trigger returns 201 for a new pass, 200 when reusing an existing run. Regenerated sqlc + OpenAPI/TS. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(review): author the reviewer prompt centrally, not in the adapter Mirror the worker model (session_manager builds the prompt; adapters just place it via LaunchConfig.Prompt). The reviewer prompt now lives in internal/review/prompt.go and is passed through ports.ReviewInvocation.Prompt; the claude-code reviewer adapter just feeds inv.Prompt to its launch command and returns it as the re-review message. One-shot CLI reviewers may ignore it. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> * refactor(review): split reviewer prompt into system+task, mirroring buildSpawnTexts Mirror session_manager.buildSpawnTexts for the reviewer: a standing role goes in the system prompt, the per-pass task (PR/commit + exact `ao review submit` command) goes in the user prompt. internal/review/prompt.go now returns (prompt, systemPrompt); both flow through ports.ReviewInvocation and the claude-code adapter places them via LaunchConfig{Prompt, SystemPrompt}. The re-review message reuses the per-pass prompt (role already established in the running pane). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.8 <noreply@anthropic.com> Co-authored-by: Vaibhaav <user@example.com> Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>

harshitsinghbhandari requested a review from Copilot May 26, 2026 15:17

Copilot started reviewing on behalf of harshitsinghbhandari May 26, 2026 15:17 View session

Copilot AI reviewed May 26, 2026

View reviewed changes

harshitsinghbhandari and others added 2 commits May 26, 2026 20:55

harshitsinghbhandari mentioned this pull request May 26, 2026

Tracking: Lifecycle Manager + Session Manager lane #3

Closed

8 tasks

harshitsinghbhandari mentioned this pull request May 26, 2026

feat(decide): pure DECIDE core + exhaustive truth-table tests #4

Merged

6 tasks

harshitsinghbhandari and others added 8 commits May 27, 2026 00:06

style(decide): gofmt the timestampPatterns doc block

cdfcdd2

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Merge pull request #4 from aoagents/session/aa-2

217c50e

feat(decide): pure DECIDE core + exhaustive truth-table tests

harshitsinghbhandari mentioned this pull request May 26, 2026

feat(lifecycle): LCM Apply* pipeline (split A) #5

Merged

7 tasks

harshitsinghbhandari and others added 3 commits May 27, 2026 01:49

Merge pull request #5 from aoagents/session/aa-3

347eb16

feat(lifecycle): LCM Apply* pipeline (split A)

harshitsinghbhandari mentioned this pull request May 26, 2026

feat(lifecycle): ACT layer — reaction table + escalation engine (split B) #6

Merged

4 tasks

harshitsinghbhandari and others added 6 commits May 27, 2026 02:26

Merge pull request #6 from aoagents/session/aa-4

f70368b

feat(lifecycle): ACT layer — reaction table + escalation engine (split B)

harshitsinghbhandari mentioned this pull request May 27, 2026

feat(session): Session Manager — spawn/kill/list/get/send/restore/cleanup #7

Merged

8 tasks

Merge pull request #7 from aoagents/session/aa-5

6cf7c04

feat(session): Session Manager — spawn/kill/list/get/send/restore/cleanup

harshitsinghbhandari changed the title ~~feat(backend): LCM + Session Manager contract package~~ feat(backend): Lifecycle Manager + Session Manager lane May 27, 2026

harshitsinghbhandari requested a review from Copilot May 27, 2026 08:40

Copilot started reviewing on behalf of harshitsinghbhandari May 27, 2026 08:40 View session

Copilot AI reviewed May 27, 2026

View reviewed changes

Comment thread backend/internal/session/manager.go

Comment thread backend/internal/session/manager.go

Comment thread backend/internal/ports/inbound.go Outdated

Comment thread backend/internal/lifecycle/reactions.go

harshitsinghbhandari mentioned this pull request May 27, 2026

fix(session): harden teardown/restore safety + drop dead reaction flag #8

Merged

Merge pull request #8 from aoagents/fix/lcm-sm-review-comments

b44aca4

fix(session): harden teardown/restore safety + drop dead reaction flag

harshitsinghbhandari merged commit cdfd97c into main May 27, 2026
2 checks passed

This was referenced May 27, 2026

Tracking: LCM+SM runner — reaper, daemon wiring, Runtime/Workspace adapters #9

Closed

reaper: two known gaps from PR #29 (process-axis collapsed; no per-probe timeout) #32

Open

neversettle17-101 mentioned this pull request Jun 3, 2026

refactor(backend): LLD maintainability fixes in controllers/service layers (#95) #96

Merged

4 tasks

harshitsinghbhandari mentioned this pull request Jun 4, 2026

feat: add provider-neutral SCM observer #76

Merged

This was referenced Jun 8, 2026

feat(config): persist per-project agent config and resolve it at spawn #154

Merged

feat: add notifications v1 #181

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(backend): Lifecycle Manager + Session Manager lane#2

feat(backend): Lifecycle Manager + Session Manager lane#2
harshitsinghbhandari merged 25 commits into
mainfrom
feat/lcm-sm-contracts

harshitsinghbhandari commented May 26, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

harshitsinghbhandari commented May 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

harshitsinghbhandari commented May 26, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What it delivers

Folded-in PRs (merged into this branch)

Key invariants

Status

Test plan

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

harshitsinghbhandari commented May 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

harshitsinghbhandari commented May 26, 2026 •

edited

Loading